Search results for "Floating point"

showing 7 items of 7 documents

Nonlinear systems solver in floating-point arithmetic using LP reduction

2009

This paper presents a new solver for systems of nonlinear equations. Such systems occur in Geometric Constraint Solving, e.g., when dimensioning parts in CAD-CAM, or when computing the topology of sets defined by nonlinear inequalities. The paper does not consider the problem of decomposing the system and assembling solutions of subsystems. It focuses on the numerical resolution of well-constrained systems. Instead of computing an exponential number of coefficients in the tensorial Bernstein basis, we resort to linear programming for computing range bounds of system equations or domain reductions of system variables. Linear programming is performed on a so-called Bernstein polytope: though,…
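As background for the range bounds mentioned in this abstract, here is a minimal sketch of the classical Bernstein enclosure for a univariate polynomial on [0, 1]. It is not the paper's LP-based method, which is designed to avoid computing these coefficients in the multivariate case. The function names `bernstein_coeffs` and `range_bounds` are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernstein_coeffs(power_coeffs):
    """Convert power-basis coefficients a_0..a_n (lowest degree first)
    to Bernstein coefficients on [0, 1]:
    b_i = sum_{k=0..i} C(i, k) / C(n, k) * a_k."""
    n = len(power_coeffs) - 1
    return [
        sum(Fraction(comb(i, k), comb(n, k)) * power_coeffs[k]
            for k in range(i + 1))
        for i in range(n + 1)
    ]


def range_bounds(power_coeffs):
    """Enclose the range of the polynomial over [0, 1].
    The polynomial is a convex combination of its Bernstein
    coefficients, so it lies between their minimum and maximum."""
    b = bernstein_coeffs(power_coeffs)
    return min(b), max(b)


# p(x) = x^2 - x, whose true range on [0, 1] is [-1/4, 0].
lo, hi = range_bounds([0, -1, 1])
print(lo, hi)  # -1/2 0
```

The enclosure is conservative (here -1/2 instead of -1/4); subdividing the interval tightens it. Exact rationals are used only to keep the illustration free of rounding effects.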

Discrete mathematics; Nonlinear system; Polynomial; Floating point; Simplex; Linear programming; Applied mathematics; Solver; Bernstein polynomial; Mathematics; Interval arithmetic

2009 SIAM/ACM Joint Conference on Geometric and Physical Modeling

researchProduct

A dynamic program analysis to find floating-point accuracy problems

2012

Programs using floating-point arithmetic are prone to accuracy problems caused by rounding and catastrophic cancellation. These phenomena provoke bugs that are notoriously hard to track down: the program does not necessarily crash and the results are not necessarily obviously wrong, but often subtly inaccurate. Further use of these values can lead to catastrophic errors. In this paper, we present a dynamic program analysis that supports the programmer in finding accuracy problems. Our analysis uses binary translation to perform every floating-point computation side by side in higher precision. Furthermore, we use a lightweight slicing approach to track the evolution of errors. We evaluate our…
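The side-by-side idea in this abstract can be illustrated with a small sketch. It assumes a simplification: instead of binary translation and a higher-precision float, each value carries an exact rational "shadow" via Python's `fractions.Fraction`. The class name `Shadow` and its methods are hypothetical and are not the paper's tool.

```python
from fractions import Fraction


class Shadow:
    """A float paired with an exact rational shadow of the same computation."""

    def __init__(self, value, exact=None):
        self.value = float(value)
        # Fraction(float) converts the binary value exactly.
        self.exact = Fraction(value) if exact is None else exact

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Shadow) else Shadow(other)

    def __add__(self, other):
        o = self._lift(other)
        return Shadow(self.value + o.value, self.exact + o.exact)

    def __sub__(self, other):
        o = self._lift(other)
        return Shadow(self.value - o.value, self.exact - o.exact)

    def __mul__(self, other):
        o = self._lift(other)
        return Shadow(self.value * o.value, self.exact * o.exact)

    def rel_error(self):
        """Relative error of the float result against the exact shadow."""
        if self.exact == 0:
            return 0.0 if self.value == 0.0 else float("inf")
        return float(abs((Fraction(self.value) - self.exact) / self.exact))


# Absorption followed by cancellation: the 1.0 is lost when added to 1e16.
c = (Shadow(1e16) + 1.0) - 1e16
print(c.value, c.exact, c.rel_error())  # 0.0 1 1.0
```

The float result is 0.0 while the exact result is 1, a relative error of 100% that the program itself never signals.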

Floating point; Computer engineering; Computer science; Computation; Rounding; Real-time computing; Binary translation; Dynamic program analysis; Benchmark (computing); Programmer

Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation

researchProduct

Efficient and portable acceleration of quantum chemical many-body methods in mixed floating point precision using OpenACC compiler directives

2016

It is demonstrated how the non-proprietary OpenACC standard of compiler directives may be used to compactly and efficiently accelerate the rate-determining steps of two of the most routinely applied many-body methods of electronic structure theory, namely the second-order Møller-Plesset (MP2) model in its resolution-of-the-identity (RI) approximated form and the (T) triples correction to the coupled cluster singles and doubles model (CCSD(T)). By means of compute directives as well as the use of optimized device math libraries, the operations involved in the energy kernels have been ported to graphics processing unit (GPU) accelerators, and the associated data transfers correspondingly o…

Floating point; Computer science; Biophysics; Graphics processing unit; FOS: Physical sciences; 010402 general chemistry; computer.software_genre; 01 natural sciences; Porting; Single-precision floating-point format; Computational science; Physics - Chemical Physics; 0103 physical sciences; Physical and Theoretical Chemistry; Molecular Biology; Chemical Physics (physics.chem-ph); 010304 chemical physics; Computational Physics (physics.comp-ph); Condensed Matter Physics; 0104 chemical sciences; Node (circuits); Compiler; Central processing unit; Host (network); computer; Physics - Computational Physics

researchProduct

LARGE-SCALE SIMULATIONS IN CONDENSED MATTER PHYSICS —THE NEED FOR A TERAFLOP COMPUTER

1992

The introduction of vector processors {“supercomputers” with a performance in the range of 10^9 floating point operations (1 GFLOP) per second} has had an enormous impact on computational condensed matter physics. The possibility of a substantially enhanced performance by massively parallel processors (“teraflop” machines with 10^12 floating point operations per second) will allow satisfactory treatment of a large range of important scientific problems which have to a great extent thus far escaped numerical resolution. The present paper describes only a few examples (out of a long list of interesting research problems!) for which the availability of “teraflops” will allow spectacular progres…

Floating point; Condensed matter physics; Computer science; Scale (chemistry); Monte Carlo method; General Physics and Astronomy; Statistical and Nonlinear Physics; Parallel computing; Large range; FLOPS; Computer Science Applications; Metallic alloy; Range (mathematics); Computational Theory and Mathematics; Massively parallel; Mathematical Physics

International Journal of Modern Physics C

researchProduct

Hardware-efficient matrix inversion algorithm for complex adaptive systems

2012

This work shows an FPGA implementation for the matrix inversion algebra operation. Usually, large matrix dimension is required for real-time signal processing applications, especially in case of complex adaptive systems. A hardware efficient matrix inversion procedure is described using QR decomposition of the original matrix and modified Gram-Schmidt method. This work attempts a direct VHDL description using few predefined packages and fixed point arithmetic for better optimization. New proposals for intermediate calculations are described, leading to efficient logic occupation together with better performance and accuracy in the vector space algebra. Results show that, for a relatively s…
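The inversion route named in this abstract (QR decomposition by modified Gram-Schmidt, then A^-1 = R^-1 Q^T) can be sketched in software as follows. This is only an illustration in Python floating point, not the paper's fixed-point VHDL design; the function names `mgs_qr` and `invert` are hypothetical, and the input is assumed to be a square, nonsingular matrix given as a list of rows.

```python
def mgs_qr(a):
    """Modified Gram-Schmidt QR of a square matrix (list of rows).
    Returns (qt, r): qt holds the rows of Q^T (the orthonormal columns
    of Q), and r is the upper-triangular factor, so that A = Q R."""
    n = len(a)
    v = [[a[i][j] for i in range(n)] for j in range(n)]  # columns of A
    qt = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = sum(x * x for x in v[j]) ** 0.5
        qj = [x / r[j][j] for x in v[j]]  # ZeroDivisionError if A is singular
        qt.append(qj)
        # Remove the q_j component from every remaining column right away;
        # this immediate update is what distinguishes the modified variant.
        for k in range(j + 1, n):
            r[j][k] = sum(qj[i] * v[k][i] for i in range(n))
            v[k] = [v[k][i] - r[j][k] * qj[i] for i in range(n)]
    return qt, r


def invert(a):
    """Invert A by solving R X = Q^T with back substitution."""
    n = len(a)
    qt, r = mgs_qr(a)
    x = [[0.0] * n for _ in range(n)]
    for c in range(n):
        for i in range(n - 1, -1, -1):
            s = qt[i][c] - sum(r[i][k] * x[k][c] for k in range(i + 1, n))
            x[i][c] = s / r[i][i]
    return x


inv = invert([[4.0, 7.0], [2.0, 6.0]])
print(inv)  # approximately [[0.6, -0.7], [-0.2, 0.4]]
```

Because R is triangular, the inversion needs no pivoting, only back substitution.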

Floating point; business.industry; QR decomposition; symbols.namesake; Matrix (mathematics); Gaussian elimination; Vectorization (mathematics); symbols; Generator matrix; business; Fixed-point arithmetic; Algorithm; Computer hardware; Mathematics; Sparse matrix

2012 19th IEEE International Conference on Electronics, Circuits, and Systems (ICECS 2012)

researchProduct

A Novel Systolic Parallel Hardware Architecture for the FPGA Acceleration of Feedforward Neural Networks

2019

New chips for machine learning applications appear; they are tuned for a specific topology, being efficient by using highly parallel designs at the cost of high power or large complex devices. However, the computational demands of deep neural networks require flexible and efficient hardware architectures able to fit different applications, neural network types, number of inputs, outputs, layers, and units in each layer, making the migration from software to hardware easy. This paper describes novel hardware implementing any feedforward neural network (FFNN): multilayer perceptron, autoencoder, and logistic regression. The architecture admits an arbitrary input and output number, units in la…

Hardware architecture; Floating point; General Computer Science; Artificial neural network; Computer science; Clock rate; Activation function; General Engineering; Sistemes informàtics; Autoencoder; Arquitectura d'ordinadors; Computational science; neural network acceleration; FPGA implementation; deep neural networks; Multilayer perceptron; Feedforward neural networks - FFNN; Feedforward neural network; Xarxes neuronals (Informàtica); General Materials Science; lcsh:Electrical engineering. Electronics. Nuclear engineering; lcsh:TK1-9971; systolic hardware architecture

IEEE Access

researchProduct

Accelerated fluctuation analysis by graphic cards and complex pattern formation in financial markets

2009

The compute unified device architecture is an almost conventional programming approach for managing computations on a graphics processing unit (GPU) as a data-parallel computing device. With a maximum number of 240 cores in combination with a high memory bandwidth, a recent GPU offers resources for computational physics. We apply this technology to methods of fluctuation analysis, which includes determination of the scaling behavior of a stochastic process and the equilibrium autocorrelation function. Additionally, the recently introduced pattern formation conformity (Preis T et al 2008 Europhys. Lett. 82 68005), which quantifies pattern-based complex short-time correlations of a time serie…

Physics; Floating point; Series (mathematics); Stochastic process; Autocorrelation; Graphics processing unit; General Physics and Astronomy; Memory bandwidth; Central processing unit; Scaling; Computational science

New Journal of Physics

researchProduct